NSF PAR Search | NSF Public Access Repository

Phoneme Hallucinator: One-Shot Voice Conversion via Set Expansion

https://doi.org/10.1609/aaai.v38i13.29411

Shan, Siyuan; Li, Yang; Banerjee, Amartya; Oliva, Junier B. (March 2024, Proceedings of the AAAI Conference on Artificial Intelligence)

Voice conversion (VC) aims to alter a person's voice so that it sounds like the voice of another person while preserving the linguistic content. Existing methods face a trade-off between content intelligibility and speaker similarity: methods with higher intelligibility usually have lower speaker similarity, while methods with higher speaker similarity usually require plenty of target-speaker voice data to achieve high intelligibility. In this work, we propose a novel method, Phoneme Hallucinator, that achieves the best of both worlds. Phoneme Hallucinator is a one-shot VC model; it adopts a novel model to hallucinate diversified and high-fidelity target-speaker phonemes from only a short target-speaker voice sample (e.g., 3 seconds). The hallucinated phonemes are then used to perform neighbor-based voice conversion. Our model is a text-free, any-to-any VC model that requires no text annotations and supports conversion to any unseen speaker. Quantitative and qualitative evaluations show that Phoneme Hallucinator outperforms existing VC methods in both intelligibility and speaker similarity.
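The neighbor-based conversion step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the similarity metric, the number of neighbors, or the feature representation, so cosine similarity, k = 4, and plain lists of floats are assumptions made here for illustration. The function names `cosine_similarity` and `knn_convert` are hypothetical, and the hallucination model itself is not shown; `target_set` simply stands in for the target-speaker features after set expansion.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def knn_convert(source_frames, target_set, k=4):
    """Replace each source frame with the mean of its k most similar
    target-speaker frames (generic k-nearest-neighbor regression).

    Assumption: target_set holds target-speaker features, e.g. the few
    real frames from a short clip plus any hallucinated ones.
    """
    if not target_set:
        raise ValueError("target_set must contain at least one frame")
    k = min(k, len(target_set))
    converted = []
    for frame in source_frames:
        # Rank all target frames by similarity to the current source frame.
        ranked = sorted(
            target_set,
            key=lambda t: cosine_similarity(frame, t),
            reverse=True,
        )
        neighbors = ranked[:k]
        # Average the selected neighbors dimension by dimension.
        converted.append(
            [sum(n[d] for n in neighbors) / k for d in range(len(frame))]
        )
    return converted
```

In this sketch, a larger and more varied `target_set` gives each source frame closer neighbors to draw from, which is the reason expanding a short target clip would matter for a neighbor-based approach.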
Full Text Available
NRTSI: Non-Recurrent Time Series Imputation

https://doi.org/10.1109/ICASSP49357.2023.10095054

Shan, Siyuan; Li, Yang; Oliva, Junier B. (June 2023, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP))

Full Text Available
Distribution-based sketching of single-cell samples

https://doi.org/10.1145/3535508.3545539

Baskaran, Vishal Athreya; Ranek, Jolene; Shan, Siyuan; Stanley, Natalie; Oliva, Junier B. (August 2022, Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics)

Full Text Available
Transparent single-cell set classification with kernel mean embeddings

https://doi.org/10.1145/3535508.3545538

Shan, Siyuan; Baskaran, Vishal Athreya; Yi, Haidong; Ranek, Jolene; Stanley, Natalie; Oliva, Junier B. (August 2022, Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics)

Full Text Available
